Adaptive High Fidelity Reproduction System for Object-Based Audio

ABSTRACT

Object-based audio is adaptively associated with speakers, depending on the speaker configuration that is present. Each speaker receives an audio assignment based on its individual spectral characteristics. As more speakers are added, content is adaptively associated with each new speaker, and taken away from the previous speakers.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. application Ser. No. 14/165,710 filed Jan. 28, 2014, which is a continuation of U.S. application Ser. No. 12/482,450 filed Jun. 10, 2009, now U.S. Pat. No. 8,638,647 issued Jan. 28, 2014, which is a divisional of U.S. Ser. No. 11/609,396 filed Dec. 12, 2006, now U.S. Pat. No. 7,561,697 issued Jul. 14, 2009, which is a continuation application of U.S. Ser. No. 10/848,993 filed May 18, 2004, now U.S. Pat. No. 7,154,819 issued Dec. 26, 2006, which is a divisional application of U.S. Ser. No. 09/799,460 filed Mar. 5, 2001, now U.S. Pat. No. 6,738,318 issued May 18, 2004, the disclosures of each of these parent applications and their file histories in the patent office are hereby incorporated by reference, in their entirety.

BACKGROUND

High fidelity systems attempt to simulate the sound that comes from actual sound-producing objects. Real music is produced when each of a plurality of different instruments, at a different location, produces its own unique sound. Each instrument also has unique sonic tuning characteristics. The real music is produced from these instruments, at different locations, producing sounds. Producing a simulation of this real music is the objective of a high fidelity music reproduction system.

Movies, in contrast, have a different objective for their sound production. In the 1980s, movie sound became a format with multiple channels providing the sound output. This format, called surround sound, produced five or more channels of sound. The channels included left and right main channels for stereo music. A center channel was used for mono parts of the reproduction such as the voice. In addition, left and right surround channels were provided for special effects. Further channels may be provided for sound having special characteristics, such as a subwoofer channel. This sound system attempts to produce the feeling of actually being part of the action depicted by the movie.

SUMMARY

The present inventor believes that an ideal musical reproduction, like real music, should produce the sound from a plurality of instruments, each coming from its own tuned source that has tuning/music reproduction characteristics that are most closely representative of the instrument. The current system of stereo reproduction reproduces most, if not all, instruments from two different sources (speakers), both of which are tuned the same.

According to the present system, information is produced for reproduction by music reproduction hardware. The information as produced has a number of separated parts. That is, each stream of audio information, such as a song, may have separated parts that form that stream. In one embodiment, those parts may be tracks on the audio reproduction medium.

The separated parts are adaptively associated with different music reproduction hardware based on the actual characteristics of the hardware producing the music. That is, for example, the violin sounds may be produced by the speaker most closely tuned to violins. Another speaker, e.g., most closely tuned to horns, may reproduce the horns.

Another aspect automatically determines specific characteristics of the hardware, and forms a file indicative of those specific characteristics of the hardware. The contents of that file are used to adaptively associate the content of the media, e.g., the music, with the hardware.

BRIEF DESCRIPTION OF THE DRAWINGS

These and other aspects of the invention will be described in detail with reference to the accompanying drawings, wherein:

FIG. 1 shows a block diagram of the system;

FIG. 2 shows a wired connection between amplifier and speaker;

FIG. 3 shows a wireless connection;

FIG. 4 shows a flowchart of operation;

FIGS. 5A-5D show different ways of storing information for use in the present application.

DETAILED DESCRIPTION

Modern audio reproduction media may have more space for storing the data that is indicative of the audio. For example, the so-called DVD may have 100 times the storage capability of a standard CD, also known as a “Redbook” format CD. Various enhanced CD formats have also been suggested which provide more data on the media and that can be used to store more information. In addition, modern compression formats, such as the MP3 format, allow reducing the size that is occupied by information placed on the media. Again, this has the effect of allowing the media to store more information.

Broadband channels are also available. For example, satellite radio channels are proposed. Broadband Internet channels have been used. In addition, audio content may also be produced over a cable and the like.

The present system will be described with reference to information being stored on the audio medium. A plurality of tracks are provided on the medium. The tracks each include information about a different aspect of the audio stream that is recorded on the medium. The audio medium is shown here as being a disk, but it should be understood that any different kind of audio-containing medium could be used with the system of the present invention.

Each track may represent a specified kind of information. In one aspect, each track includes information about the same kind of instruments. The instruments included on a single track may be of the same type, e.g. all violins, or may have the same spectral characteristics, that is all string instruments, or all producing output within a specified spectral region, or primarily within a specified spectral region. Sounds may be grouped based on spectral characteristics, e.g., by using a fast Fourier transform on recorded sound from the instrument. Each instrument may be characterized on the spectrum, e.g. by forming a histogram indicating the amount of energy in each spectral bucket. Alternatively, instruments or sounds which may effectively compress may be grouped together. The instruments which are sufficiently similar may be grouped together as a track. This has a number of advantages in the context of the present system. First of all, it makes the information on the track more compressible by certain compression systems such as MP3, since each instrument on the track has similar characteristics. In addition, on readout, the track can be accurately reproduced by the same kind of reproduction equipment.
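The spectral characterization described above can be illustrated with a minimal sketch. It is not the inventor's implementation: it uses a naive discrete Fourier transform in place of a fast Fourier transform (the values are the same, only slower to compute), and the function names, bucket edges, distance measure, and greedy grouping rule are all assumptions made for illustration.

```python
import cmath
import math


def tone(freq_hz, sample_rate, n):
    """Generate n samples of a pure sine tone (stand-in for a recorded instrument)."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]


def spectral_histogram(samples, sample_rate, bucket_edges_hz):
    """Return the fraction of signal energy that falls in each frequency bucket."""
    n = len(samples)
    energy = [0.0] * (len(bucket_edges_hz) - 1)
    # Naive DFT over the positive-frequency bins; an FFT gives the same values faster.
    for k in range(1, n // 2):
        freq = k * sample_rate / n
        coeff = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        power = abs(coeff) ** 2
        for b in range(len(energy)):
            if bucket_edges_hz[b] <= freq < bucket_edges_hz[b + 1]:
                energy[b] += power
                break
    total = sum(energy) or 1.0  # avoid dividing by zero for silence
    return [e / total for e in energy]


def histogram_distance(h1, h2):
    """Sum of absolute differences between two histograms (assumed similarity measure)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def group_by_similarity(histograms, threshold):
    """Greedy grouping: an instrument joins the first group whose first member is close enough."""
    groups = []
    for name, hist in histograms.items():
        for group in groups:
            if histogram_distance(histograms[group[0]], hist) <= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups
```

Each resulting group would correspond to one candidate track; a real system would work from recorded instrument sound rather than the synthetic tones used here.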

Multiple tracks are placed on the medium for different purposes. For example, a single medium may include movie style tracks such as left, right, center, left surround and right surround, and also a subwoofer setting. The left and right tracks on the medium represent the stereo information. The remaining information in the tracks may represent information from different individual instruments or instrument types. This information may include separate tracks for each of voices, strings, winds, guitar, percussion, bass strings, and bass winds, with the understanding that each different instrument may also be broken up based on its characteristics, e.g. bass or treble. The above has described 13 tracks for each stored item of information. It should be understood, however, that there may be fewer or more tracks, e.g. up to 20 tracks. Since each track may represent information of the specified instrument type, the information in the track may be highly compressible.

As can be seen from the above, the medium will typically include more information than is necessary to actually play back the audio on any system. For example, the medium may include stereo left and stereo right channels. However, on some systems, 10 speakers may be provided for different instrument types, and this information includes parts of the information that is also within the left and right stereo. If the separated channels are used, the audio left and audio right information might not be used. Therefore, the audio medium may include redundant information. Adaptive decisions are made during playback indicating which speakers and/or which music reproduction equipment gets which content.

An embodiment is shown in FIG. 1. A disk 100 includes a plurality of tracks of information. For example, if the tracks above are used, the stream, shown as 110, may include 13 different channels. The medium may also include control track 105 which may be a data track including information about which tracks on the medium include which information.
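As an illustration only, the control track 105 can be thought of as a small table that tells the adaptive element what each of the 13 tracks contains. The record layout, field names, and helper functions below are hypothetical; the text does not specify an encoding for the control track.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackEntry:
    index: int   # position of the track within the stream 110
    label: str   # human-readable content description
    kind: str    # "channel" (movie-style) or "instrument" (assumed categories)


# The 13 tracks enumerated in the description above.
CHANNEL_LABELS = ["left", "right", "center", "left surround", "right surround", "subwoofer"]
INSTRUMENT_LABELS = ["voices", "strings", "winds", "guitar", "percussion", "bass strings", "bass winds"]


def build_control_track():
    """Build the hypothetical control-track table for the 13-track example."""
    entries = []
    for label in CHANNEL_LABELS:
        entries.append(TrackEntry(len(entries), label, "channel"))
    for label in INSTRUMENT_LABELS:
        entries.append(TrackEntry(len(entries), label, "instrument"))
    return entries


def tracks_of_kind(control_track, kind):
    """Return the indices of all tracks of the given kind."""
    return [entry.index for entry in control_track if entry.kind == kind]
```

A player with only stereo speakers could use such a table to locate the left and right tracks, while an enhanced system could look up the instrument tracks instead.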

The medium is read out by a player. The contents of the medium are interpreted by the adaptive element that is either in the player, or in a controller or amplifier associated with the player. The adaptive element is shown herein as 150, and as being part of the amplifier.

The amplifier is connected to a plurality of different speakers or different amplified speakers. Each speaker system, such as 155, has specified spectral and/or other sound producing characteristics. In an embodiment, each speaker may also be active, in the sense that it includes an electronic module associated with the speaker. That electronic module allows communication with the speaker, and may include information about the speaker's characteristics. In another embodiment, characteristics of the speakers may be obtained in a different way.

The characteristics of the speaker may be communicated to the memory 165 over the speaker wire using serial formats such as universal serial bus, or RS 232 for example. Alternatively, the amplifier 150 may include a medium reading capability shown as 170. This reading capability may read a storage medium, such as a floppy disk, memory stick, CD, or mini CD which is inserted therein. The medium includes information about the speakers, which is then read from the medium, and stored in the memory. Another way of communicating information is to obtain characteristics from a public network such as the Internet.

In another aspect, each speaker that is purchased is provided with an audio medium such as a CD or DVD. That audio medium is intended to be played in the CD player associated with the stereo. The contents of the CD are played as normal CD audio. However, electronic information about the speakers is encoded in the CD audio. Thus, this includes a specified code that can be read by the amplifier 150, and indicates that speaker information follows. The following information includes speaker information.
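One way to picture the "specified code followed by speaker information" arrangement is sketched below. The marker bytes, the two-byte length field, and the JSON payload are all assumptions chosen for illustration; the description does not define the code or the encoding, and a real system would first have to recover this data from the CD audio samples.

```python
import json

# Hypothetical marker announcing that speaker information follows.
MARKER = b"\xaa\x55SPKR"


def encode_speaker_info(info):
    """Pack speaker information as marker + 2-byte big-endian length + JSON payload."""
    payload = json.dumps(info).encode("utf-8")
    return MARKER + len(payload).to_bytes(2, "big") + payload


def extract_speaker_info(stream):
    """Scan a decoded data stream for the marker and return the info that follows, or None."""
    start = stream.find(MARKER)
    if start < 0:
        return None  # no marker: ordinary audio content only
    pos = start + len(MARKER)
    if pos + 2 > len(stream):
        return None  # stream ends before the length field
    length = int.from_bytes(stream[pos:pos + 2], "big")
    payload = stream[pos + 2:pos + 2 + length]
    if len(payload) < length:
        return None  # truncated payload
    return json.loads(payload.decode("utf-8"))
```

The amplifier 150 would then store the returned information in the memory 165, just as it would for information read from an inserted storage medium.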

The main amplifier 150 also includes a processor 170 which makes adaptive decisions about which speakers will be selected to play each track or channel on the medium. This adaptive decision will be based on the specific characteristics of the speakers, and the specific characteristics of the audio. The decision is based on, of course, the specific hardware which is present in the system. More hardware, actually more speakers, in the system, will enable better sound. When fewer speakers are present, tracks will need to be combined. In the minimum configuration, only two speakers are present, and the standard stereo is played. Each time a speaker is added, it receives multiple tracks assigned to be played to it, based on its spectral characteristics. This enables the user to make purchases based on their preferences. The user who likes the sound of strings, for example, may purchase a speaker that is tuned to strings. When this speaker is added to the amplifier system 150, its characteristics are stored in memory 165. Playing of media will thereafter assign information from the media 100 to those speakers, based on the speakers' characteristics. Similarly, other speakers for horns, and other instruments may also be purchased. Each speaker is adaptively associated with the content for those speakers. Each extra speaker is assigned with sound, and that sound is hence not played by the other speakers. Therefore, more speakers allow better reproduction of the sound.
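The adaptive decision can be sketched as a nearest-profile match. This is a simplified illustration under stated assumptions, not the claimed method: speaker and track characteristics are represented as short lists of energy fractions per spectral band, closeness is measured by a sum of absolute differences, and the speaker names are hypothetical.

```python
def profile_distance(a, b):
    """Sum of absolute differences between two band-energy profiles (assumed measure)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def assign_tracks(speakers, instrument_tracks):
    """Map each instrument track to the speaker whose stored profile is closest."""
    assignment = {name: [] for name in speakers}
    for track, track_profile in instrument_tracks.items():
        best = min(
            speakers, key=lambda s: profile_distance(speakers[s], track_profile)
        )
        assignment[best].append(track)
    return assignment


def plan_playback(speakers, instrument_tracks):
    """Fall back to standard stereo when only two speakers are present."""
    if len(speakers) <= 2:
        return {"mode": "stereo", "assignment": {}}
    return {
        "mode": "multitrack",
        "assignment": assign_tracks(speakers, instrument_tracks),
    }
```

Because every track goes to exactly one speaker, adding a speaker whose profile better matches a track moves that track away from the speaker that previously played it, which mirrors the behavior described above.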

Different ways of getting the information into the memory are also considered. FIG. 2 illustrates a plug-and-play type operation for doing this. In FIG. 2, the amplifier 250 is connected via a standard line connection to the speaker 260. The speaker 260 includes an electronics module 265 therein. The module 265 communicates with a corresponding module in the amplifier, using any serial protocol but preferably Ethernet, USB, or RS 232. Any protocol that may communicate over a two-wire line may be used. In this embodiment, the amplifier may poll the speaker using a low voltage level signal. Since the signal is at a low voltage level, it will produce little if any sound out of the speaker. However, the electronics module 265 within the speaker may still recognize this as control signals. The speaker responds with information indicative of its individual spectral characteristics. This information is then stored in the memory 165 within the amplifier. The information may also be used in the playback mode, to determine channel allocations for the information from the media.

A wireless alternative is shown in FIG. 3. This may use wireless formats such as Bluetooth, wireless LAN, or some other wireless format. FIG. 3 shows a Bluetooth module 310 in the speaker 300. The amplifier 350 also includes a Bluetooth module shown as 355. Again, this system may operate by polling. The speaker may respond to a poll with information indicative of the speaker's individual characteristics. This information is then stored in the memory 165.
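Whether the link is wired or wireless, the polling exchange follows the same pattern, which the sketch below simulates in memory. The poll code, the reply format, and the class names are hypothetical; no actual serial or Bluetooth transport is modeled.

```python
POLL_REQUEST = b"POLL"  # hypothetical control code sent by the amplifier


class SpeakerModule:
    """Stand-in for the electronics module inside an active speaker."""

    def __init__(self, speaker_id, characteristics):
        self.speaker_id = speaker_id
        self.characteristics = characteristics

    def handle(self, message):
        """Reply to a poll with this speaker's characteristics; ignore anything else."""
        if message == POLL_REQUEST:
            return {
                "id": self.speaker_id,
                "characteristics": dict(self.characteristics),
            }
        return None


class Amplifier:
    """Stand-in for the amplifier, with a dict playing the role of memory 165."""

    def __init__(self):
        self.memory = {}

    def poll(self, modules):
        """Poll every connected module and store each reply in memory."""
        for module in modules:
            reply = module.handle(POLL_REQUEST)
            if reply is not None:
                self.memory[reply["id"]] = reply["characteristics"]
        return self.memory
```

Polling again after a new speaker is connected simply adds its entry to the stored characteristics, which can then be consulted during playback.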

In any of these embodiments, the user can purchase more speakers at any time. Settings for the music are automatically determined by the characteristics of the speaker.

The above-described operations may operate according to the flowchart of FIG. 4, which may run in the processor 170.

At 400, the system polls all speakers. This may be carried out at each time of power on, or may be carried out only once for example on initial connection. The speakers may also include the capability of determining room acoustics, in which case it may be desirable to poll the speakers at each power up, or at time intervals.

At 405, the system determines settings based on the polling. These settings may optionally be displayed at 410. At 415, the content of the tracks is adaptively associated with the user's individual stereo setup.

The above has described the information stored on the medium. This “enhanced” information may be stored on the media in a number of different ways.

FIG. 5A shows the medium being a disk with a first portion that has normal CD stereo 500, that can be read by any CD player, and reproduced through normal stereo equipment. A second, enhanced portion of the disk 505 includes multitrack enhanced information. Since the first portion is in typical CD form, this setup will require that the medium have additional space available. An advantage of this system is that the medium can be read on any standard CD player.

FIG. 5B shows another system in which the entire medium is stored in multitrack format. In this system, the standard stereo information is interleaved with other tracks of additional information. Standard CD format includes headers that are specified by the standard. These headers include information such as P and Q parts. These headers include signals that instruct a standard CD player to ignore certain parts of the data stream that is stored on the disk; those parts being reproduced only by enhanced players. For example, CDs may include capability of quad reproduction, and the enhanced information tracks could be labeled as quad, so that a standard player ignores this information.

FIG. 5C shows another alternative in which a dual-sided disk has a first side 520 representing normal information and a second side 525 which is an enhanced disk.

While the above has described the information being present on the disk, it should be understood that other forms of reproduction and obtaining of information are possible. All such forms are intended to be encompassed.

Other embodiments are within the disclosed invention. 

What is claimed is:
 1. A system for reproducing object-based audio, the system comprising: a controller for receiving an input audio signal and generating a plurality of output audio signals, wherein the input audio signal includes at least one audio object; a plurality of speakers coupled to the controller for producing sound in response to the plurality of output audio signals; and a memory coupled to or integrated with the controller for storing one or more sonic characteristics of the plurality of speakers, wherein the controller determines which of the plurality of output audio signals to send to which of the plurality of speakers based at least in part on the sonic characteristics.
 2. The system of claim 1 wherein the one or more sonic characteristics include power handling capabilities.
 3. The system of claim 1 wherein the one or more sonic characteristics include frequency response.
 4. The system of claim 1 wherein the one or more sonic characteristics include the maximum power handling.
 5. The system of claim 1 wherein the one or more sonic characteristics include the amount of bass capable of being generated.
 6. The system of claim 1 wherein the one or more sonic characteristics include the amount of treble capable of being generated.
 7. The system of claim 1 wherein the one or more sonic characteristics include spectral characteristics.
 8. The system of claim 1 wherein the at least one audio object includes an audio signal and accompanying location information.
 9. A system for reproducing object-based audio, the system comprising: a plurality of speakers for producing sound, wherein at least one of the plurality of speakers is dedicated to producing bass; a processor for analyzing one or more input audio signals and generating one or more output audio signals; and a storage coupled to or integrated with the processor for storing one or more sonic characteristics of the plurality of speakers, wherein the controller sends output audio signals having low frequency components to the at least one of the plurality of speakers dedicated to producing bass based at least in part on the one or more sonic characteristics and wherein the one or more input audio signals includes at least one audio object.
 10. The system of claim 9 wherein the one or more sonic characteristics include power handling capabilities.
 11. The system of claim 9 wherein the one or more sonic characteristics include frequency response.
 12. The system of claim 9 wherein the one or more sonic characteristics include the maximum power handling.
 13. The system of claim 9 wherein the one or more sonic characteristics include the amount of bass capable of being generated.
 14. The system of claim 9 wherein the one or more sonic characteristics include the amount of treble capable of being generated.
 15. The system of claim 9 wherein the one or more sonic characteristics include spectral characteristics.
 16. The system of claim 9 wherein the at least one audio object includes an audio signal and accompanying location information.
 17. A method for reproducing audio, comprising: receiving one or more input audio signals; generating a plurality of output audio signals based on the one or more input audio signals; storing one or more sonic characteristics of a plurality of audio reproduction equipment; and determining which of the plurality of output audio signals to send to which of the audio reproduction equipment based at least in part on the sonic characteristics, wherein the one or more input audio signals includes at least one audio object that comprises an audio signal and accompanying location information.